Sains Malaysiana 53(6)(2024): 1463-1476

http://doi.org/10.17576/jsm-2024-5306-18

 

Spatial Functional Outlier Detection in Multivariate Spatial Functional Data

(Pengesanan Outlier Fungsian Reruang dalam Data Fungsian Reruang Multivariat)

 

NUR FATIHAH MOHD ALI¹, ROSSITA MOHAMAD YUNUS¹,*, IBRAHIM MOHAMED¹ & FARIDAH OTHMAN²

¹Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, 50603 Kuala Lumpur, Malaysia

²Department of Civil Engineering, Faculty of Engineering, Universiti Malaya, 50603 Kuala Lumpur, Malaysia

 

Received: 11 January 2024/Accepted: 23 May 2024

 

Abstract

Multivariate spatial functional data consist of multiple time-dependent attributes, each observed as a function, at every spatial point. This study focuses on detecting spatial outliers in such data. First, we develop a new method, the Mahalanobis Distance Spatial Outlier (MDSO) method, to detect functional outliers. The method introduces two metrics, the multivariate functional Mahalanobis semi-distance and the multivariate pairwise functional Mahalanobis semi-distance, both based on multivariate functional principal component analysis, to measure the dissimilarity between the functions observed at different spatial points. A simulation study shows that MDSO performs better than the competing methods. Second, we extend MDSO to detect spatial functional outliers, so that each functional outlier can be categorized as a global outlier, a local outlier, or both. The appropriate number of neighbors and the cut-off point for the degree of isolation are determined via simulation. Finally, we apply MDSO to a water quality data set from the Sungai Klang basin in Malaysia. The results can support the authorities in making better decisions on the management of the river basin, and the method can be applied to other spatial data with time-dependent attributes.

 

Keywords: Functional Mahalanobis distance; multivariate functional data; spatial outlier; water quality
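
The idea of a Mahalanobis-type semi-distance built from principal component scores can be illustrated with a minimal sketch. This is not the authors' MDSO implementation: it assumes curves sampled on a common time grid, uses ordinary PCA on the concatenated curves as a simple stand-in for multivariate functional principal component analysis, and runs on synthetic data. All function names and parameter values are hypothetical.

```python
import numpy as np


def standardized_scores(curves, n_components=3):
    """Leading principal component scores, each scaled to unit variance.

    curves has shape (n_sites, n_variables, n_timepoints). Concatenating
    the variables is an assumption made here for illustration only.
    """
    n_sites = curves.shape[0]
    flat = curves.reshape(n_sites, -1)
    centered = flat - flat.mean(axis=0)
    # SVD of the centered data gives the principal components directly.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    eigvals = s[:n_components] ** 2 / (n_sites - 1)
    return scores / np.sqrt(eigvals)


def semi_distance_to_mean(z):
    """Semi-distance of each site from the mean curve (score origin)."""
    return np.sqrt((z ** 2).sum(axis=1))


def pairwise_semi_distance(z):
    """Semi-distance between every pair of sites."""
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


# Synthetic example: 30 sites, 2 attributes, 50 time points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
n_sites = 30
amp = rng.normal(1.0, 0.1, size=(n_sites, 2, 1))
base = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
curves = amp * base + rng.normal(0.0, 0.05, size=(n_sites, 2, 50))
curves[7, 0] += 3.0 * t  # contaminate site 7 with a linear trend

z = standardized_scores(curves)
dist = semi_distance_to_mean(z)
pair = pairwise_semi_distance(z)
print("most atypical site:", int(np.argmax(dist)))
```

It is called a semi-distance because only the leading components are retained, so two different curves with identical leading scores are at distance zero. A spatial extension would compare each site with its nearest neighbors through the pairwise matrix. The number of neighbors and the cut-off are tuning choices that the abstract says were determined via simulation, so they are not fixed in this sketch.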

 

*Corresponding author; email: rossita@um.edu.my